UNITED STATES DEPARTMENT OF AGRICULTURE
BULLETIN No. 504
OFFICE OF THE SECRETARY
Contribution from the Office of Farm Management
W. J. SPILLMAN, Chief
Washington, D. C. PROFESSIONAL PAPER May 23, 1917
THE THEORY OF CORRELATION AS APPLIED TO FARM-
SURVEY DATA ON FATTENING BABY BEEF.
By H. R. Tolley, Scientific Assistant.
CONTENTS.
Introduction
Theory of correlation
Computation of the coefficients
Interpretation of the coefficients
Summary
INTRODUCTION.
This paper sets forth the results of an experiment in applying the
theory of correlation, hitherto used chiefly in the analysis of biologi-
cal, sociological, psychological, and meteorological statistics, 1 to the
study of some of the data of the Office of Farm Management.
The material for the investigation was obtained from 67 records
taken during the years 1914 and 1915 from farmers of the corn belt
who were fattening baby beef for market. 2 The factors considered
were: The profit or loss per head, the weight, value per hundred-
weight, value of feed consumed per head, cost at weaning time, and
date of sale (see Table I). Coefficients of correlation were computed
for every pair of these factors and used as a measure of the relation-
ship existing between them.
THEORY OF CORRELATION.
The writer will not attempt a detailed explanation of the theory
of correlation but will discuss briefly the meaning of coefficients of
correlation and the method by which they are obtained.
1 Yule, G. U. : " Introduction to the Theory of Statistics," 1912. Yule, G. U. : " On the
Theory of Correlation," Jour. Roy. Stat. Soc, 1897, p. 812. Davenport, C. B. : " Statisti-
cal Methods, With Special Reference to Biological Variation," 1914. Hooker, R. H. :
" The Correlation of the Weather and the Crops," Jour. Roy. Stat. Soc, 1907, p. 1.
Smith, J. W. ; " Effect of Weather on Yield of Corn," Monthly Weather Review, vol. 42,
p. 72 ; and " Effect of Weather on Yield of Potatoes," ibid., vol. 43, p. 232. Brown, Wm. :
" Essentials of Mental Measurement," 1911.
2 For detailed account of the methods by which these data were obtained and the costs
computed, see Report 111, Office of the Secretary, 1916.
Table I. Data on cost of producing baby beef.

Farm No.  Profit per  Weight per       Value per  Total value of   Cost per head  Date of sale
            head (1)        head   hundredweight   feed per head at weaning time  (months) (2)
                        (pounds)
       1     +$12.07         785           $8.35          $31.49          $23.12           8.7
       2      -22.98         750            7.75           29.62           46.33           6.3
       3       +2.79         690            7.20           20.90           30.94           2.6
       4       +6.07         820            8.00           30.00           30.02           6.4
       5      -14.05         852            7.52           31.88           44.15          12.7
       6       -9.93       1,000            9.75           37.47           64.34          12.3
       7      +13.68         825            8.50           32.68           25.76           8.8
       8      +15.15         825            8.50           23.04           32.64           8.8
       9      +27.42         800            9.50           31.06           18.01           8.5
      10       -8.92         875            9.75           59.10           33.92           8.8
      11      -19.09         922           10.14           70.52           40.20           9.9
      12      +18.75         810            9.30           30.25           28.08           9.0
      13       -7.07       1,080            9.75           71.01           38.62           8.9
      14      +13.53       1,048           10.11           47.43           39.83          10.0
      15      +38.15       1,012           10.35           49.47           20.47          10.9
      16       +9.83       1,000            9.75           40.56           42.89          10.0
      17      +19.05         807            9.70           33.58           28.43           4.3
      18       -5.73         915            9.10           38.41           51.36           8.0
      19       -2.39         910            9.15           45.48           42.17           5.0
      20      +10.93         890            9.70           48.95           25.86           8.2
      21       +9.67         876            9.40           23.91           43.98           8.7
      22       -3.65         988            8.75           43.95           38.52          12.2
      23      +49.37       1,050            9.75           27.08           27.74          10.3
      24       -7.28         798            8.25           42.84           30.09           8.0
      25      -43.00         675            8.00           44.71           50.80           6.6
      26       +5.65         689            7.75           23.39           24.61           5.5
      27       +0.59         860           10.00           49.74           35.85           7.5
      28       -7.71         746            8.00           26.53           40.30           6.7
      29       -2.78         850            8.90           33.35           43.46           6.5
      30       -1.36         890            7.30           20.33           49.80           3.5
      31       +6.76         859            8.15           27.73           30.89           4.0
      32       +7.91         765            8.10           21.02           30.83           6.0
      33      +18.07         744            9.48           34.95           19.51           7.2
      34       +1.33         700            9.00           31.61           28.25           5.0
      35      -32.63         740            9.25           45.72           49.76           5.9
      36      +12.97         800            9.00           31.00           26.57           6.5
      37      +11.15         800            7.30           17.96           27.11           2.7
      38      -43.71         740            8.50           20.12           85.66           5.0
      39      +18.15         700            7.75            9.38           24.78           3.8
      40       -9.86         785            8.25           33.70           38.73           8.8
      41       +2.18         656            8.14           21.90           30.09           4.9
      42      +23.99         925            8.60           26.75           28.55           5.9
      43      -22.97         766            8.40           54.46           35.09           6.9
      44      -12.73         750            8.50           18.02           54.33           7.0
      45      +11.80         805            8.00           20.76           34.62           5.0
      46      -22.90         924            9.85           57.59           56.63           8.0
      47       +0.27         800            9.25           48.53           28.97           7.6
      48       +5.37         800            8.25           29.79           29.03           5.7
      49       +5.33         862           10.20           49.77           30.90          10.7
      50       +2.82         800            9.00           24.81           42.86           7.0
      51      +16.68         840           10.00           45.54           19.54           8.0
      52       -7.07         840            9.25           41.09           37.53           7.7
      53       -3.09         650            7.50           12.28           41.66           3.0
      54      -24.04         775            8.40           39.86           47.90           6.0
      55      -10.45         741            8.50           29.04           45.68           6.0
      56       +2.83         768            7.20           24.53           29.62           6.0
      57       -0.09       1,060            8.30           47.10           45.41           3.0
      58       +3.88         900            8.95           37.02           42.27           6.0
      59       -6.42         793            9.60           56.46           27.69           8.5
      60       -8.37         855            8.55           42.40           40.51           5.5
      61      -27.61         850            8.05           43.83           51.29           6.0
      62       +1.64         915            8.55           34.41           39.39           6.2
      63       -0.18         811            8.60           37.13           32.90           8.9
      64       -1.00         775            8.20           18.30           43.72           5.0
      65       +8.00         742            8.25           17.39           34.85           6.9
      66       +5.55         827            8.50           22.30           42.05           7.3
      67      +21.73         950            9.50           33.64           32.09          12.0
 Average       +0.78         834            8.76           35.02           37.01           7.2
1 A plus sign before the quantity in this column indicates a gain; a minus sign, a loss.
2 In order to facilitate computation, the dates have been expressed in months and decimals of a month
after Jan. 1; i. e., 8.7 indicates Aug. 20, 21, or 22; 6.3 indicates June 8, 9, or 10, etc.
If, in two series of associated variables, as, say, the profit per
head and the weight per head in the data under consideration, there
is a tendency for a high value of the first to be associated with a
high value of the second, the variables are said to be correlated, and
the correlation is positive; while if a high value of the first is asso-
ciated with a low value of the second, and vice versa, the correlation
is said to be negative, and the best measure yet devised of the amount
of the correlation is the so-called coefficient of correlation. In Table
II is shown the calculation of the coefficient of correlation between
profit and weight per head.
The method is as follows :
1. Find the average value for each of the variables. Here the average
profit per head is $0.78, and the average weight 834 pounds.
2. Calculate the departure of the individual values from the average. In
the case of record No. 1, the departure of the profit from the average is +$11.29,
and of the weight, -49 pounds.
3. Find the square root of the average of the squares of these departures.
This is the so-called " standard deviation," and is a measure of dispersion or
the amount of variability of each variable.
4. Find the algebraic sum of the products of each pair of individual depart-
ures, i. e., for each record, multiply the departure of the profit from the average
by the departure of the weight from the average, and prefix the proper sign;
then find the difference between the sum of all the plus products and the
sum of all the minus products.
5. Divide this result by the number of records and the standard deviation
of each of the variables in turn, prefix the proper sign, and the figure obtained
is the coefficient of correlation between the two factors under consideration.
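The five steps above can be sketched in Python. This is only an illustrative sketch, not the procedure used for the bulletin's hand computation: the function name `correlation_coefficient` is hypothetical, the departures are kept at full precision rather than rounded to one decimal, and the sample values are the first five records of Table I, used purely as placeholder input.

```python
import math


def correlation_coefficient(xs, ys):
    """Coefficient of correlation computed by the five steps in the text."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("need two series of equal, non-zero length")

    # Step 1: average value of each variable.
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Step 2: departure of each individual value from its average.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]

    # Step 3: standard deviation = square root of the average squared departure.
    sigma_x = math.sqrt(sum(d * d for d in dx) / n)
    sigma_y = math.sqrt(sum(d * d for d in dy) / n)
    if sigma_x == 0 or sigma_y == 0:
        raise ValueError("a series with no variability has no correlation")

    # Step 4: algebraic sum of the products of paired departures.
    sum_xy = sum(a * b for a, b in zip(dx, dy))

    # Step 5: divide by the number of records and by each standard deviation.
    return sum_xy / (n * sigma_x * sigma_y)


# Placeholder input: profit and weight for farms 1 to 5 of Table I.
profit = [12.07, -22.98, 2.79, 6.07, -14.05]
weight = [785, 750, 690, 820, 852]
r_sample = correlation_coefficient(profit, weight)
```

With all 67 records supplied, the same function should land close to the +.277 worked out in Table II, although small differences are to be expected because the printed table rounds the departures.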
If there are approximately the same number of positive and nega-
tive products and they are of the same size, it will be evident that
there is no correlation, and this will be shown by the fact that the
coefficient of correlation will be zero, or nearly so. If high values
of the first variable are associated with high values of the second,
and low values of the first with low values of the second, most of
the products will be plus, and the greater their sum the closer will
be the correlation and the larger will be the coefficient obtained.
If a value of one variable below the average is generally associated
with a value of the other above the average, the correlation will
evidently be negative, and this will be shown by the fact that the
sum of the products will be negative, the degree of the correlation
and the size of the coefficient depending upon the size of this sum.
Expressed algebraically, the coefficient of correlation,
\[ r = \frac{\Sigma xy}{n\,\sigma_x\,\sigma_y}, \tag{I} \]
where \(\Sigma xy\) is the sum of the products above mentioned, n is the number
of pairs of variables (the same as the number of records); \(\sigma_x\)
Table II. Calculation of coefficient of correlation between profit and weight
per head.

Farm No.    Profit         x          x²    Weight       y        y²           xy
          per head                        per head
            (1), X                       (lbs.), Y
       1   +$12.07    +11.29      127.69       785     -49     2,401       -553.7
       2    -22.98    -23.76      566.44       750     -84     7,056     +1,999.2
       3     +2.79     +2.01        4.00       690    -144    20,736       -288.0
       4     +6.07     +5.29       28.09       820     -14       196        -74.2
       5    -14.05    -14.83      219.04       852     +18       324       -266.4
       6     -9.93    -10.71      114.49     1,000    +166    27,556     -1,776.2
       7    +13.68    +12.90      166.41       825      -9        81       -116.1
       8    +15.15    +14.37      207.36       825      -9        81       -129.6
       9    +27.42    +26.64      707.56       800     -34     1,156       -904.4
      10     -8.92     -9.70       94.09       875     +41     1,681       -397.7
      11    -19.09    -19.87      396.01       922     +88     7,724     -1,751.2
      12    +18.75    +17.97      324.00       810     -24       576       -432.0
      13     -7.07     -7.85       62.41     1,080    +246    60,516     -1,943.4
      14    +13.53    +12.75      163.84     1,048    +214    45,796     +2,739.2
      15    +38.15    +37.37    1,398.76     1,012    +178    31,684     +6,657.2
      16     +9.83     +9.05       81.00     1,000    +166    27,556     +1,494.0
      17    +19.05    +18.27      334.89       807     -27       729       -494.1
      18     -5.73     -6.51       42.25       915     +81     6,561       -526.5
      19     -2.39     -3.17       10.24       910     +76     5,776       -243.2
      20    +10.93    +10.15      104.04       890     +56     3,136       +571.2
      21     +9.67     +8.89       79.21       876     +42     1,764       +373.8
      22     -3.65     -4.43       19.36       988    +154    23,716       -677.6
      23    +49.37    +48.59    2,361.96     1,050    +216    46,656    +10,497.6
      24     -7.28     -8.06       65.61       798     -36     1,296       +291.6
      25    -43.00    -43.78    1,918.44       675    -159    25,281     +6,964.2
      26     +5.65     +4.87       24.01       689    -145    21,025       -710.5
      27     +0.59     -0.19         .04       860     +26       676         -5.2
      28     -7.71     -8.49       72.25       746     -88     7,744       +748.0
      29     -2.78     -3.56       12.96       850     +16       256        -57.6
      30     -1.36     -2.14        4.41       890     +56     3,136       -117.6
      31     +6.76     +5.98       36.00       859     +25       625       +150.0
      32     +7.91     +7.13       50.41       765     -69     4,761       -489.9
      33    +18.07    +17.29      299.29       744     -90     8,100     -1,557.0
      34     +1.33     +0.55         .36       700    -134    17,956        -80.4
      35    -32.63    -33.41    1,115.56       740     -94     8,836     +3,139.6
      36    +12.97    +12.19      148.84       800     -34     1,156       -414.8
      37    +11.15    +10.37      108.16       800     -34     1,156       -353.6
      38    -43.71    -44.49    1,980.25       740     -94     8,836     +4,183.0
      39    +18.15    +17.37      302.76       700    -134    17,956     -2,331.6
      40     -9.86    -10.64      112.36       785     -49     2,401       +519.4
      41     +2.18     +1.40        1.96       656    -178    31,684       -249.2
      42    +23.99    +23.21      556.96       925     +91     8,281     +2,111.2
      43    -22.97    -23.75      566.44       766     -68     4,624     +1,618.4
      44    -12.73    -13.63      184.96       750     -84     7,056     +1,142.4
      45    +11.80    +11.02      121.00       805     -29       841       -319.0
      46    -22.90    -23.68      561.69       924     +90     8,100     -2,133.0
      47     +0.27     -0.51         .25       800     -34     1,156        +17.0
      48     +5.37     +4.59       21.16       800     -34     1,156       -156.4
      49     +5.33     +4.55       21.16       862     +28       784       +128.8
      50     +2.82     +2.04        4.00       800     -34     1,156        -68.0
      51    +16.68    +15.90      252.81       840      +6        36        +95.4
      52     -7.07     -7.85       60.84       840      +6        36        -46.8
      53     -3.09     -3.87       15.21       650    -184    33,856       +717.6
      54    -24.04    -24.82      615.04       775     -59     3,481     +1,463.2
      55    -10.45    -11.23      125.44       741     -93     8,649     +1,041.6
      56     +2.83     +2.05        4.00       768     -66     4,356       -132.0
      57     -0.09     -0.87         .81     1,060    +226    51,076       -203.4
      58     +3.88     +3.10        9.61       900     +66     4,356       +204.6
      59     -6.42     -7.20       51.84       793     -41     1,681       +295.2
      60     -8.37     -9.15       84.64       855     +21       441       -193.2
      61    -27.61    -28.36      806.56       850     +16       256       -454.4
      62     +1.64     +0.86         .81       915     +81     6,561        +72.9
      63     -0.18     -0.96        1.00       811     -23       529        +23.0
      64     -1.00     -1.78        3.24       775     -59     3,481       +106.2
      65     +8.00     +7.22       51.84       742     -92     8,464       -662.4
      66     +5.55     +4.77       23.04       827      -7        49        -33.6
      67    +21.73    +20.95      441.00       950    +116    13,456     +2,436.0
   Total                       18,452.16                       660,257    +30,457.6
 Average     +0.78                             834

Standard deviations: \(\sigma_x\) = $16.60; \(\sigma_y\) = 99 lbs.
\[ r = \frac{\Sigma xy}{n\,\sigma_x\,\sigma_y} = \frac{+30{,}457.6}{(67)(16.60)(99)} = +0.277 \]
\[ E_r = \pm .6745\,\frac{1 - r^2}{\sqrt{n}} = \pm .076 \]
1 A plus sign before the quantity in this column indicates a profit, a minus sign a loss.
The quantities in the column headed x are given to two places of decimals, but it was
found that the use of one decimal place would give the quantities in the x² and xy
columns with sufficient accuracy, and the computations were made accordingly. Thus,
for farm No. 1, (11.3)² = 127.69, and (+11.3)(-49) = -553.7.
is the standard deviation of the first variable; and \(\sigma_y\) the standard
deviation of the second. The value of r will always be between
+1 and -1, +1 indicating perfect positive correlation, and -1
perfect negative correlation; and to be significant, the value should
be appreciably greater than its probable error,
\[ E_r = \pm .6745\,\frac{1 - r^2}{\sqrt{n}}. \tag{II} \]
In the example, r = +.277, and its probable error is ±.076, so there
was a tendency for the heavier calves to return a greater profit, but
the correlation is by no means perfect.
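The probable error of formula (II) is easy to check numerically. The short sketch below assumes a hypothetical helper named `probable_error` and uses the figures quoted in the example (r = +.277, n = 67).

```python
import math


def probable_error(r, n):
    """Probable error of a correlation coefficient, formula (II)."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


pe = probable_error(0.277, 67)
print(round(pe, 3))          # 0.076
print(round(0.277 / pe, 1))  # 3.6
```

On these figures the coefficient is between three and four times its probable error.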
PARTIAL CORRELATION.
A study in which many factors are concerned is not complete
until it is determined whether or not an apparent correlation, meas-
ured in the manner explained above, is due to the fact that each
of the two variables (or factors) under consideration is correlated
with another or even several other variables. For instance, in the
data under consideration there is apparently a high correlation be-
tween the weight of the calves and the value per hundredweight
received for them (r = +.56), and the question now arises whether heavier
calves really do demand a higher price on the market. This corre-
lation might be due entirely or in part to the fact that the heavier
calves in the records obtained were sold at a later date, and that
the price of cattle in general was higher later in the season; that
is, the correlation exhibited here might be due to the fact that both
weight and price are correlated with date of sale.
In a problem of this type, where it is necessary to consider simul-
taneously the relation between three variables and to determine the
correlation between any two, a coefficient of net or partial correlation 1
can be determined by the formula:
\[ r_{ab.c} = \frac{r_{ab} - r_{ac}\, r_{bc}}{\sqrt{(1 - r_{ac}^{2})(1 - r_{bc}^{2})}} \tag{III} \]
Calling the three variables a, b, and c, the terms of the formula are:
\(r_{ab.c}\) is the coefficient of net correlation between a and b, when the
effect of c is considered; \(r_{ab}\) is the ordinary coefficient of gross
correlation between a and b and is obtained as explained above; \(r_{ac}\) and \(r_{bc}\)
are the coefficients of gross correlation between a and c and b and c,
respectively. Continuing with the example above, let us endeavor to
determine the degree of correlation between weight and value per
hundredweight, after taking into account any effect that date of sale
might have had. In other words, we seek an answer to the question :
1 Yule, G. U. : " Introduction to the Theory of Statistics," p. 229 et seq.
What would have been the coefficient of correlation between weight
and price if all the calves had been sold on the same date ? Calling
the weight w, the value per hundredweight v, and the date of
sale d, the gross correlation coefficients are: 1 \(r_{wv} = +.56\); \(r_{vd} = +.61\);
\(r_{wd} = +.60\). Applying the formula (III), we have:
\[ r_{wv.d} = \frac{+.56 - (+.61)(+.60)}{\sqrt{(1 - .61^{2})(1 - .60^{2})}} = +.31 \]
This value, +.31, is appreciably smaller than the value, +.56, of
the gross coefficient, showing that the apparent correlation between
weight and price is partly, but not entirely, due to their mutual
correlation with the date of sale.
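The weight and price example can be reproduced with a few lines of Python. This is a minimal sketch of formula (III); the function name `partial_correlation` is hypothetical, and the inputs are the three gross coefficients quoted in the text.

```python
import math


def partial_correlation(r_ab, r_ac, r_bc):
    """Net correlation between a and b with c taken into account, formula (III)."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


# a = weight (w), b = value per hundredweight (v), c = date of sale (d)
r_wv_d = partial_correlation(0.56, 0.60, 0.61)
print(round(r_wv_d, 2))  # 0.31
```

When the third variable is uncorrelated with both of the others (r_ac = r_bc = 0), the function returns the gross coefficient unchanged, which is a convenient sanity check.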
This theory can be applied to the case of several variables by a
simple extension of the formula. 2
In the general case for six variables, the total number considered
in this paper,
\[ r_{ab.cdef} = \frac{r_{ab.cde} - r_{af.cde}\, r_{bf.cde}}{\sqrt{(1 - r_{af.cde}^{2})(1 - r_{bf.cde}^{2})}} \tag{IV} \]
\(r_{ab.cdef}\) is the net coefficient of correlation between a and b, when the
four factors, c, d, e, and f, are taken into account; \(r_{ab.cde}\), \(r_{af.cde}\), and
\(r_{bf.cde}\) are the coefficients of correlation between the two variables
before the period in each case when c, d, and e are taken into account.
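Because each coefficient in formula (IV) is built from coefficients of the next lower order, the extension lends itself to a recursive sketch. The code below is an illustration under stated assumptions, not the bulletin's working method: `make_partial` and the dictionary layout are hypothetical, and the gross coefficients supplied are the two-decimal values quoted in the text and Table III for profit (p), weight (w), value per hundredweight (v), and date of sale (d), so higher-order results may differ slightly from the printed ones through rounding.

```python
import math
from functools import lru_cache


def make_partial(gross):
    """Return r(a, b, controls) built on a table of gross coefficients.

    `gross` maps frozenset({a, b}) to the gross coefficient between a and b.
    """

    @lru_cache(maxsize=None)
    def r(a, b, controls=()):
        if not controls:
            return gross[frozenset((a, b))]
        # Peel off one controlled factor and apply the formula to the
        # three coefficients of the next lower order.
        rest, last = controls[:-1], controls[-1]
        r_ab = r(a, b, rest)
        r_al = r(a, last, rest)
        r_bl = r(b, last, rest)
        return (r_ab - r_al * r_bl) / math.sqrt((1 - r_al ** 2) * (1 - r_bl ** 2))

    return r


pairs = {
    ("w", "v"): 0.56, ("w", "d"): 0.60, ("v", "d"): 0.61,
    ("p", "w"): 0.28, ("p", "v"): 0.23, ("p", "d"): 0.14,
}
gross = {frozenset(k): val for k, val in pairs.items()}
r = make_partial(gross)

r_wv_d = r("w", "v", ("d",))       # first order, as in the worked example
r_wv_pd = r("w", "v", ("p", "d"))  # second order
print(round(r_wv_d, 2))  # 0.31
```

The order in which the controlled factors are listed should not matter, so `r("w", "v", ("p", "d"))` and `r("w", "v", ("d", "p"))` agree apart from floating-point error.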
COMPUTATION OF THE COEFFICIENTS.
The first step in the arithmetic was the computation of the gross
correlation coefficients. As stated above, the variables or factors
considered were: (1) The profit or loss per head; (2) weight;
(3) value per hundredweight; (4) total value of feed consumed per
head; (5) cost per head at weaning time; and (6) date of sale.
These six variables, if taken two at a time, can be combined in 15
different ways. The first calculation was to find the coefficients of
correlation between these 15 different pairs. In Table III these are
the first values given. The effect of every other factor on these
gross coefficients was then eliminated by successive applications of
formulae III and IV. As an example, take profit and weight,
the first pair of variables correlated. The gross coefficient was first
corrected for the effect of value per hundredweight, value of feed
consumed, initial cost, and date of sale, in turn. Then the effect
of these four factors was considered, taking them two at a time.
That is, the correlation was determined when both the value per
hundredweight and the cost of feed were taken into consideration
at the same time. When the effect of all these factors, taking them
1 See Table III : Correlation coefficients.
2 Yule, G. U. : " Introduction to the Theory of Statistics," p. 229 et seq.
Table III. Correlation coefficients.

Profit (p) and weight (w), value per hundredweight (v), value of feed (f), cost at
weaning time (c), and date of sale (d).

p and w             p and v             p and f             p and c             p and d
r_{pw}       +.28   r_{pv}       +.23   r_{pf}       -.27   r_{pc}       -.73   r_{pd}       +.14

r_{pw.v}     +.18   r_{pv.w}     +.10   r_{pf.w}     -.50   r_{pc.w}     -.78   r_{pd.w}     -.03
r_{pw.f}     +.50   r_{pv.f}     +.56   r_{pf.v}     -.58   r_{pc.v}     -.73   r_{pd.v}      .00
r_{pw.c}     +.48   r_{pv.c}     +.25   r_{pf.c}     -.38   r_{pc.f}     -.75   r_{pd.f}     +.29
r_{pw.d}     +.24   r_{pv.d}     +.19   r_{pf.d}     -.37   r_{pc.d}     -.73   r_{pd.c}     +.16

r_{pw.vf}    +.39   r_{pv.wf}    +.48   r_{pf.wv}    -.64   r_{pc.wv}    -.78   r_{pd.wv}    -.08
r_{pw.vc}    +.43   r_{pv.wc}    -.04   r_{pf.wc}    -.83   r_{pc.wf}    -.92   r_{pd.wf}    +.06
r_{pw.vd}    +.20   r_{pv.wd}    +.12   r_{pf.wd}    -.50   r_{pc.wd}    -.79   r_{pd.wc}    -.18
r_{pw.fc}    +.85   r_{pv.fc}    +.71   r_{pf.vc}    -.74   r_{pc.vf}    -.83   r_{pd.vf}    +.02
r_{pw.fd}    +.43   r_{pv.fd}    +.50   r_{pf.vd}    -.58   r_{pc.vd}    -.73   r_{pd.vc}    +.02
r_{pw.cd}    +.49   r_{pv.cd}    +.18   r_{pf.cd}    -.50   r_{pc.fd}    -.77   r_{pd.fc}    +.38

r_{pw.vfc}   +.91   r_{pv.wfc}   +.83   r_{pf.wvc}   -.95   r_{pc.wvf}   -.97   r_{pd.wvf}   -.15
r_{pw.vfd}   +.42   r_{pv.wfd}   +.49   r_{pf.wvd}   -.65   r_{pc.wvd}   -.79   r_{pd.wvc}   -.19
r_{pw.vcd}   +.46   r_{pv.wcd}   +.04   r_{pf.wcd}   -.83   r_{pc.wfd}   -.92   r_{pd.wfc}   -.10
r_{pw.fcd}   +.83   r_{pv.fcd}   +.65   r_{pf.vcd}   -.74   r_{pc.vfd}   -.83   r_{pd.vfc}   +.06

r_{pw.vfcd}  +.97   r_{pv.wfcd}  +.94   r_{pf.wvcd}  -.98   r_{pc.wvfd}  -.98   r_{pd.wvfc}  +.77

Weight (w) and value per hundredweight (v), value of feed (f), cost at weaning time (c),
and date of sale (d); value per hundredweight (v) and value of feed (f).

w and v             w and f             w and c             w and d             v and f
r_{wv}       +.56   r_{wf}       +.51   r_{wc}       +.07   r_{wd}       +.60   r_{vf}       +.65

r_{wv.p}     +.53   r_{wf.p}     +.63   r_{wc.p}     +.42   r_{wd.p}     +.59   r_{vf.p}     +.76
r_{wv.f}     +.35   r_{wf.v}     +.23   r_{wc.v}     +.15   r_{wd.v}     +.39   r_{vf.w}     +.51
r_{wv.c}     +.57   r_{wf.c}     +.51   r_{wc.f}     +.08   r_{wd.f}     +.49   r_{vf.c}     +.66
r_{wv.d}     +.31   r_{wf.d}     +.36   r_{wc.d}     +.12   r_{wd.c}     +.60   r_{vf.d}     +.55

r_{wv.pf}    +.10   r_{wf.pv}    +.42   r_{wc.pv}    +.42   r_{wd.pv}    +.40   r_{vf.pw}    +.65
r_{wv.pc}    +.53   r_{wf.pc}    +.86   r_{wc.pf}    +.81   r_{wd.pf}    +.42   r_{vf.pc}    +.84
r_{wv.pd}    +.28   r_{wf.pd}    +.50   r_{wc.pd}    +.45   r_{wd.pc}    +.61   r_{vf.pd}    +.68
r_{wv.fc}    +.36   r_{wf.vc}    +.22   r_{wc.vf}    +.13   r_{wd.vf}    +.39   r_{vf.wc}    +.52
r_{wv.fd}    +.14   r_{wf.vd}    +.24   r_{wc.vd}    +.16   r_{wd.vc}    +.39   r_{vf.wd}    +.50
r_{wv.cd}    +.32   r_{wf.cd}    +.36   r_{wc.fd}    +.12   r_{wd.fc}    +.50   r_{vf.cd}    +.56

r_{wv.pfc}   -.67   r_{wf.pvc}   +.88   r_{wc.pvf}   +.89   r_{wd.pvf}   +.42   r_{vf.pwc}   +.88
r_{wv.pfd}   -.09   r_{wf.pvd}   +.44   r_{wc.pvd}   +.45   r_{wd.pvc}   +.43   r_{vf.pwd}   +.65
r_{wv.pcd}   +.27   r_{wf.pcd}   +.80   r_{wc.pfd}   +.79   r_{wd.pfc}   +.36   r_{vf.pcd}   +.77
r_{wv.fcd}   +.16   r_{wf.vcd}   +.22   r_{wc.vfd}   +.14   r_{wd.vfc}   +.40   r_{vf.wcd}   +.50

r_{wv.pfcd}  -.90   r_{wf.pvcd}  +.97   r_{wc.pvfd}  +.96   r_{wd.pvfc}  +.81   r_{vf.pwcd}  +.97

Value per hundredweight (v) and cost at weaning time (c) and date of sale (d); value of
feed (f) and cost at weaning time (c) and date of sale (d); cost at weaning time (c) and
date of sale (d).

v and c             v and d             f and c             f and d             c and d
r_{vc}       -.09   r_{vd}       +.61   r_{fc}       +.01   r_{fd}       +.42   r_{cd}       -.04

r_{vc.p}     +.12   r_{vd.p}     +.60   r_{fc.p}     -.28   r_{fd.p}     +.48   r_{cd.p}     +.09
r_{vc.w}     -.16   r_{vd.w}     +.41   r_{fc.w}     -.03   r_{fd.w}     +.16   r_{cd.w}     -.11
r_{vc.f}     -.13   r_{vd.f}     +.49   r_{fc.v}     +.10   r_{fd.v}     +.03   r_{cd.v}     +.02
r_{vc.d}     -.08   r_{vd.c}     +.61   r_{fc.d}     +.04   r_{fd.c}     +.42   r_{cd.f}     -.05

r_{vc.pw}    -.14   r_{vd.pw}    +.42   r_{fc.pw}    -.79   r_{fd.pw}    +.17   r_{cd.pw}    -.21
r_{vc.pf}    +.54   r_{vd.pf}    +.41   r_{fc.pv}    -.58   r_{fd.pv}    +.04   r_{cd.pv}    +.03
r_{vc.pd}    +.08   r_{vd.pc}    +.59   r_{fc.pd}    -.37   r_{fd.pc}    +.53   r_{cd.pf}    +.27
r_{vc.wf}    -.17   r_{vd.wf}    +.39   r_{fc.wv}    +.07   r_{fd.wv}    -.07   r_{cd.wv}    -.05
r_{vc.wd}    -.13   r_{vd.wc}    +.40   r_{fc.wd}    -.01   r_{fd.wc}    +.16   r_{cd.wf}    -.10
r_{vc.fd}    -.12   r_{vd.fc}    +.48   r_{fc.vd}    +.10   r_{fd.vc}    +.03   r_{cd.vf}    +.01

r_{vc.pwf}   +.77   r_{vd.pwf}   +.41   r_{fc.pwv}   -.92   r_{fd.pwv}   -.15   r_{cd.pwv}   -.18
r_{vc.pwd}   -.05   r_{vd.pwc}   +.40   r_{fc.pwd}   -.77   r_{fd.pwc}   +.01   r_{cd.pwf}   -.13
r_{vc.pfd}   +.49   r_{vd.pfc}   +.33   r_{fc.pvd}   -.58   r_{fd.pvc}   +.07   r_{cd.pvf}   +.06
r_{vc.wfd}   -.14   r_{vd.wfc}   +.38   r_{fc.wvd}   +.06   r_{fd.wvc}   -.06   r_{cd.wvf}   -.04

r_{vc.pwfd}  +.90   r_{vd.pwfc}  +.81   r_{fc.pwvd}  -.96   r_{fd.pwvc}  -.82   r_{cd.pwvf}  -.75
two at a time, had been considered, they were taken three at a time,
and finally all four were taken into account simultaneously. In
all, 260 of the coefficients were computed. Some of them, however,
are of little interest, and if this were simply a study of the data,
and not also an exposition of the method used, the computation of
some of them might have been omitted.
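The order in which the coefficients were built up (gross first, then with one, two, three, and finally all four of the remaining factors taken into account) can be sketched as a simple enumeration. The helper name `coefficient_labels` and the label format `r_pw.vf` are assumptions made for illustration; the letters follow the notation used in Table III.

```python
from itertools import combinations

# p = profit, w = weight, v = value per hundredweight,
# f = value of feed, c = cost at weaning time, d = date of sale
factors = ["p", "w", "v", "f", "c", "d"]


def coefficient_labels(factors):
    """List a label for every gross and net coefficient of every pair."""
    labels = []
    for a, b in combinations(factors, 2):
        others = [x for x in factors if x not in (a, b)]
        # Take the remaining factors none, one, two, ... at a time.
        for k in range(len(others) + 1):
            for held in combinations(others, k):
                suffix = "." + "".join(held) if held else ""
                labels.append(f"r_{a}{b}{suffix}")
    return labels


labels = coefficient_labels(factors)
pair_count = len(list(combinations(factors, 2)))
per_pair = len(labels) // pair_count
print(pair_count, per_pair)  # 15 16
```

Under these assumptions each of the 15 pairs receives 16 labels (1 gross, 4 + 6 + 4 with one to three factors held, and 1 with all four held), or 240 in all; a total stated differently elsewhere may have been tallied on another basis.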
In order to avoid needless repetition in the tables, the different
factors are designated by letters as follows: Profit = p, weight = w,
value per hundredweight = v, value of feed per head = f, cost at weaning
time = c, date of sale = d, and the notation for the different coefficients
is the same as that used in the explanation of the theory, viz:
\(r_{ab.cd}\), etc., is the coefficient of correlation between a and b when c, d,
etc., are taken into account.
INTERPRETATION OF THE COEFFICIENTS.
There are four factors, namely, initial cost, value of feed consumed,
weight, and selling price, which determine almost entirely the profit
or loss to the farmer in finishing cattle for market. In fattening
baby beef animals, the weight of the calves and the value of feed
consumed both depend, to a large extent, on their age when sold
and the length of time they were on feed. Also the price per pound
received for them is rather intimately connected with the date
on which they were sold, prices having had a tendency to rise as the
season advanced. The calves for which data were gathered were
all born in the spring, went on feed in the fall or early winter, and
were sold some time during the following year. Consequently, any
one of the three factors, age, length of feeding period, and date of
sale, is a very good measure of the other two, and on account of
this only the date of sale has been considered.
If the price per pound, value of feed, initial cost, and date of sale
were constant, and if nothing else affected the profit, it would vary
directly with the weight in every case and, according to the theory,
the coefficient of correlation between the two should be +1. The
coefficient, \(r_{pw.vfcd}\), obtained here is +.97. Similarly, if all things
were constant except value per hundredweight and profit, there
would be perfect positive correlation between them. The net coefficient,
\(r_{pv.wfcd}\), given in Table III is +.94. If all things were constant
except the value of feed consumed we should expect a high negative
correlation between it and profit, i. e., the calf that received the least
feed would return the greatest profit. The net coefficient, \(r_{pf.wvcd}\), is
-.98. Similarly, other things being equal, perfect negative correlation
should exist between initial cost and profit. The net coefficient
in this case, \(r_{pc.wvfd}\), is also -.98. An examination of the remainder
of these net coefficients, which are the last ones given in the table,
will show that, with the exception of the five between date of sale
and the other variables, they are all numerically equal to or above
.90. It has been shown that part, but not all, of the correlation be-
tween weight and price was due to the date of sale, and since date
of sale is only an approximate measure of age and length of feeding
period, it would not be reasonable to expect the net correlation be-
tween it and the other variables to be perfect. The fact that all the
net coefficients except these five are so nearly +1 or -1, when
there was every reason to expect perfect correlation, is striking
proof of the reliability of this method of analysis as well as of the
accuracy of data such as those under consideration, and is at the
same time a very good check on the computations.
In the interpretation of the coefficients care must be taken to dis-
tinguish between subjective and relative factors, i. e., between cause
and effect. Most interest is naturally attached to determining to
just what extent each of the factors under consideration is respon-
sible for the farmer's loss or gain in his baby-beef enterprise, and
here there can be no confusion of cause and effect, for all the other
factors are necessarily causative. Throughout the remainder of the
investigation the amount of profit or loss is an effect and not a cause,
and consequently too much weight should not be given to a coefficient
in which the effect of profit has been taken into account.
THE APPARENT CORRELATIONS.
In taking up the discussion of the coefficients themselves, the ap-
parent correlations between profit and the other five factors are
first considered :
Coefficients of correlation.

Profit and weight                       +.28
Profit and value per hundredweight      +.23
Profit and value of feed                -.27
Profit and cost at weaning time         -.73
Profit and date of sale                 +.14
These five coefficients should show the average effect of each of the
five factors on the profit. The coefficient for profit and date of sale
(+.14) shows that the profit on the calves sold early in the season was
practically as great as on those sold later. The first three are all of
nearly the same size, but are too small to indicate more than slight
relationship. In regard to them we may say, therefore, that in the
data under consideration: (1) There was a tendency for the heavier
calves to return a greater profit; (2) there is some correlation be-
tween price per pound and profit; (3) generally speaking, the farmer
whose calves consumed feed worth more than the average made a
profit somewhat less than the average.
A very high degree of correlation between profit and cost at wean-
ing time is shown by the coefficient —.73, and, as would be expected,
it is negative. The size of this coefficient as compared to the others
indicates that the cost of producing the calves and carrying them
until weaning time is by far the most important factor in determining
the profit derived by any particular farmer from the production
of baby beef. In all of the records considered the calves were with
the cows until they went on feed, and there was no expense directly
chargeable to them. Bearing this in mind, the further statement is
justified that the cost of maintaining the breeding herd and the size
of the calf crop have considerably more to do with the profitableness
of the enterprise than the actual preparation of the calves for market.
Coefficients of correlation between weight and factors other than profit.

Weight and value per hundredweight      +.56
Weight and value of feed                +.51
Weight and cost at weaning time         +.07
Weight and date of sale                 +.60
The coefficient +.07, for weight and cost at weaning time, is the
most striking one given here. Its very small size shows that there
is no connection between the cost of the calves up to the time they
went on feed and the weights at which they were sold. The cost of
a calf at weaning time is determined very largely by the manner in
which the breeding herd is handled, and consequently this coefficient
shows further that on the farms studied the calves from the herds
which were maintained at a low cost per head weighed just as much
when sold as did those from herds having a high maintenance cost.
The coefficients for weight and value of feed and weight and date of
sale are what should normally be expected. The calves that received
more feed than the average weighed more than the average, and the
ones that were sold in the latter part of the season also weighed more
than the average. The high correlation, exhibited by the coefficient
+.56, between weight and price per pound is a surprising one, but it
will be shown later that it is almost entirely due to the mutual corre-
lation of these two factors with some of the others.
The gross coefficient for value per hundredweight and value of
feed, 1 +.65, shows another apparently high correlation which may
or may not disappear when some of the other factors are taken into
account. There is no correlation between value per hundredweight
and cost at weaning time. The correlation between value per pound
and date of sale is shown by the coefficient +.61, which confirms
1 For this and all coefficients mentioned hereafter, see Table III.
the statement already made that the price was generally higher later
in the season. The remaining gross coefficients are +.01 for total
value of feed consumed per head and cost at weaning time, +.42 for
value of feed consumed and date of sale, and —.04 for cost at wean-
ing time and date of sale. The coefficients +.01 and —.04 show that
cost at weaning time is uncorrelated with either value of feed con-
sumed or date of sale. With regard to the correlation between value
of feed consumed per head and date of sale, we may say that the
value of feed consumed is probably very nearly proportional to the
length of the feeding period, and if the actual length of time on feed
had been used here instead of its approximate measure, the date
of sale, the correlation would probably have been higher.
EFFECT OF THE OTHER FACTORS ON THE APPARENT CORRELATIONS.
The small degree of correlation present between profit and weight
is mostly due to differences in price, the coefficient being reduced
from +.28 to +.18, when the value per hundredweight is taken into
account ; that is to say, the tendency of the heavier calves to be the
more profitable is mostly due to the fact that they sold for a better
price per pound than that commanded by the smaller calves.
The coefficient \(r_{pw.f}\) is +.50, which is considerably higher than the
gross coefficient, showing that if the value of feed had been constant
while other things remained unchanged, the correlation between
profit and weight would have been greater.
The correct explanation of the size of the coefficient \(r_{pw.c}\), which
is +.48, is not so apparent. It indicates, however, that if the in-
fluence of the cost at weaning time, the factor most closely related to
profit, were eliminated, the correlation between profit and weight
would be greater.
When the date of sale is taken into account, the correlation be-
tween profit and weight becomes somewhat less than the gross cor-
relation, but the difference is not enough to be significant.
The coefficients obtained for the correlation between weight and
profit, when the effect of the other factors, two at a time, is con-
sidered, are generally higher than when they are considered one at
a time. This means that if the influence of two of the factors con-
tributing to the profit or loss is eliminated, its correlation with any
of the remaining factors is higher than if the influence of but one
had been eliminated.
It is interesting to note here that the correlation between weight
and profit, even when the other factors are taken into account, is
almost entirely independent of the date of sale. The apparent cor-
relation, +.28, becomes +.24 when date of sale is taken into ac-
count. When value per pound is taken into account, the coefficient
is +.18; when price and date of sale are considered simultaneously,
the coefficient is +.20. Similarly, $r_{pw.f} = +.50$, and
$r_{pw.fd} = +.43$; $r_{pw.c} = +.48$, and $r_{pw.cd} = +.49$;
$r_{pw.vf} = +.39$, and $r_{pw.vfd} = +.42$; $r_{pw.vc}$ = [value
illegible in source], and $r_{pw.vcd} = +.46$; $r_{pw.fc} = +.85$, and
$r_{pw.fcd} = +.83$; $r_{pw.vfc} = +.91$, and $r_{pw.vfcd} = +.97$.
The remainder of the coefficients will not be taken up in detail, for
the same reasoning may be applied as has been used for those between
profit and weight. The notation is consistent throughout, and the
arrangement is such that any desired coefficient can be found.
There does not seem to be any relation between cost at weaning
time and any of the other factors considered, except profit, and since
cost at weaning time had more influence on profit than any of the
others, it might be of interest to know the relationship that would
have existed between profit and the other factors if the initial cost
had been constant.
The coefficients are as follows:
    $r_{pw.c}$ = +.48
    $r_{pv.c}$ = +.25
    $r_{pf.c}$ = -.38
    $r_{pd.c}$ = +.16
From these coefficients, it is evident that if the initial cost of all
the calves had been the same, the most important factor in deter-
mining the profit would have been the weight when marketed; the
other factors in the order of their importance being the total value of
feed consumed, the price per pound, and the date of sale. How-
ever, the correlation between profit and date of sale is still too small
to be important.
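The ordering stated above can be checked by ranking the four net coefficients by absolute size, since the sign shows only the direction of the relationship. The following is a minimal sketch; the dictionary labels and variable names are illustrative and do not come from the bulletin, while the four values are those quoted in the text.

```python
# Net coefficients between profit and each factor, with cost at weaning
# time held constant (values as quoted in the text).
net_with_cost_constant = {
    "weight when marketed": +0.48,
    "price per pound": +0.25,
    "value of feed consumed": -0.38,
    "date of sale": +0.16,
}

# Rank by magnitude: the sign gives direction, the size gives strength.
ranked = sorted(
    net_with_cost_constant,
    key=lambda name: abs(net_with_cost_constant[name]),
    reverse=True,
)

for name in ranked:
    print(f"{name:24s} {net_with_cost_constant[name]:+.2f}")
```

The negative sign for value of feed consumed indicates that, with initial cost constant, a larger feed bill went with a smaller profit.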
The statement has already been made that the apparent correlation
between weight and value per hundredweight ($r_{wv} = +.56$) is due
to the effect of other factors. A study of the coefficients obtained
when these other factors are taken into consideration shows that
when the influence of date of sale is eliminated, the coefficient is
reduced to +.31; when the influence of the value of feed consumed is
eliminated, the coefficient becomes +.35; and when the two factors
are taken into account simultaneously, the coefficient is +.14. This
shows that the quantity of feed consumed per head was responsible
for nearly as much of this correlation as was the date of sale, and
that the two together account for practically the whole of it. In
other words, the value of feed consumed and the date of sale need
to be considered simultaneously here, because the later the date of
sale, the longer is the feeding period, and consequently the greater
the quantity and value of feed consumed.
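The arithmetic behind this comparison can be laid out as the drop in the coefficient when each factor is eliminated. This is only a rough descriptive gauge under the assumption that simple differences are a fair summary; differences between correlation coefficients are not an exact apportionment of the relationship. The variable names are illustrative.

```python
# Gross coefficient between weight and value per hundredweight, and the
# net coefficients quoted in the text.
gross = 0.56
net = {
    "date of sale": 0.31,
    "value of feed consumed": 0.35,
    "both together": 0.14,
}

# Drop in the coefficient when each factor (or both) is held constant.
reduction = {name: round(gross - value, 2) for name, value in net.items()}

for name, drop in reduction.items():
    print(f"{name:24s} reduces the coefficient by {drop:.2f}")
```

The drop for value of feed consumed (.21) is nearly as large as that for date of sale (.25), and the two together remove .42 of the original .56.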
The gross correlation between date of sale and value per pound is
shown by the coefficient +.61, and that between total value of feed
consumed per head and value per pound, by the coefficient +.65.
These rather large coefficients become very little smaller when all the
other causal factors are taken into account. Therefore, there must
be some relationship existing between value per pound and date of
sale, and value per pound and value of feed consumed. The reason
for the correlation between value per pound and date of sale has
already been given. The high correlation between the value per pound
and value of feed consumed is probably due to the fact that the calves
which were fed the heaviest ration, regardless of the length of feeding
period, were the fattest when marketed, and consequently sold at a
higher price. However, the relation between the profit and value of
feed consumed per head, as measured by the correlation coefficient
$r_{pf}$, is -.27, and when the influence of a longer feeding period is
taken into account by eliminating the effect of date of sale the
correlation is still negative.
SUMMARY.
The results show that data such as those obtained by farm manage-
ment surveys can be analyzed very thoroughly by the use of the corre-
lation coefficients. It is generally known before the analysis is at-
tempted which factors are causal and which resultant, and conse-
quently there should be very little difficulty in interpreting the coeffi-
cients correctly. The coefficients of net correlation afford a very good
means of determining the net effect of each of several factors bearing
upon a result, or of eliminating the effect of other factors when it is
desired to find the true relationship existing between any two.
Although it is not possible to give a definite concrete meaning to cor-
relation coefficients, they are very concise relative measures of the
degree of relationship existing between the factors being studied.
They therefore give the investigator a single index which will show
what, by the ordinary tabular method, it takes a whole table to show.
While properly constructed tables will show whether or not any rela-
tionship exists between two factors, it is a difficult matter to deter-
mine which of two causes, say, has the greater effect on the result, and
it is impossible, without a large number of records and a great amount
of sorting and tabulation, to separate all the factors being considered
in a study and find the effect that each one would have had if the
others had not been present, or if they had been constant throughout
the investigation. If the gross coefficients of correlation between
every pair of factors have been determined, it is possible to find these
relationships by simply substituting in the formula for determining
a net coefficient from the gross coefficients, without any further refer-
ence to the records themselves. This method should be especially use-
ful if only a limited number of records or observations are available,
for it does away with the necessity of sorting into many groups, with
the consequent falling off in the reliability of the averages obtained.
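The substitution described above can be sketched in a few lines of Python. The function below applies the standard partial-correlation recursion, removing one factor at a time, and needs only the table of gross coefficients. The function name `net_r`, the factor labels x, y, z, u, and the numerical values are all hypothetical illustrations and are not the bulletin's data.

```python
from math import sqrt


def net_r(gross, a, b, controls=()):
    """Net correlation of a and b with the factors in `controls` held constant.

    `gross` maps frozenset({i, j}) to the gross coefficient r_ij.
    """
    if not controls:
        return gross[frozenset((a, b))]
    # Peel off the last control and recurse on the remaining ones.
    *rest, last = controls
    rest = tuple(rest)
    r_ab = net_r(gross, a, b, rest)
    r_al = net_r(gross, a, last, rest)
    r_bl = net_r(gross, b, last, rest)
    return (r_ab - r_al * r_bl) / sqrt((1 - r_al**2) * (1 - r_bl**2))


# Hypothetical gross coefficients for four factors (illustrative only).
gross = {
    frozenset(("x", "y")): 0.50,
    frozenset(("x", "z")): 0.40,
    frozenset(("x", "u")): 0.20,
    frozenset(("y", "z")): 0.30,
    frozenset(("y", "u")): 0.10,
    frozenset(("z", "u")): 0.30,
}

r_xy_z = net_r(gross, "x", "y", ("z",))        # one factor eliminated
r_xy_zu = net_r(gross, "x", "y", ("z", "u"))   # two factors eliminated
print(f"r_xy.z  = {r_xy_z:+.3f}")
print(f"r_xy.zu = {r_xy_zu:+.3f}")
```

No reference to the individual records is needed once the gross coefficients are in hand, and the order in which the controlled factors are listed does not affect the result.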
The analysis of the data on fattening baby beef animals showed:
(1) That for the herds considered, the cost of producing the calves
and carrying them until weaning time was by far the most important
factor in determining the profit;
(2) That there was no connection between the cost at weaning time
and any of the other factors, for the calves which were produced
cheaply were seemingly just as good feeders and brought just as good
a price per pound as the more expensive ones;
(3) That the weight at which the calves were sold and the date of
sale had very little effect on the profit, except for the fact that in the
two years of the records the price was higher in the latter part of the
summer, at the time when the heavier calves were put on the market;
(4) That the calves which consumed the heaviest ration sold at
higher prices than the others, but did not return a correspondingly
greater profit, as the advanced price scarcely offset the extra value
of feed consumed.
OTHER PUBLICATIONS OF THE UNITED STATES DEPARTMENT
OF AGRICULTURE RELATING TO THE SUBJECT OF THIS
BULLETIN.
Elementary Notes on Least Squares, the Theory of Statistics and Correlation,
for Meteorology and Agriculture. (Monthly Weather Review, vol. 44, 1916, p.
551.)
Effect of Weather on Yield of Potatoes. (Monthly Weather Review, vol. 43,
1915, p. 232.)
Effect of Weather on Yield of Corn. (Monthly Weather Review, vol. 42, 1914,
p. 72.)
Methods and Cost of Growing Beef Cattle in the Corn Belt States. (Report No.
111, Office of the Secretary.)
ADDITIONAL COPIES
OF THIS PUBLICATION MAY BE PROCURED FROM
THE SUPERINTENDENT OF DOCUMENTS
GOVERNMENT PRINTING OFFICE
WASHINGTON, D. C.
AT
5 CENTS PER COPY